Multiclass Cancer Classification Using Gene Expression Profiling and Probabilistic Neural Networks
Abstract
Gene expression profiling by microarray technology has been successfully applied to the classification and diagnostic prediction of cancers. Various machine learning and data mining methods are currently used to classify gene expression data. However, these methods were not developed to address the specific requirements of gene microarray analysis. First, microarray data are characterized by a high-dimensional feature space whose dimensionality often exceeds the number of samples by a factor of 100 or more. Second, microarray data exhibit a high degree of noise. Most of these methods do not adequately address the problems of dimensionality and noise. Furthermore, although machine learning and data mining methods are based on statistics, most such techniques do not meet the biologist's requirement for sound mathematical confidence measures. Finally, most machine learning and data mining classification methods fail to incorporate misclassification costs, i.e., they are indifferent to the costs associated with false positive and false negative classifications. In this paper, we present a probabilistic neural network (PNN) model that addresses all of these issues. The PNN model provides sound statistical confidences for its decisions and can model asymmetrical misclassification costs. We demonstrate the performance of the PNN on multiclass gene expression data sets and compare it with two machine learning methods, a decision tree and a neural network. To assess the performance of the classifiers, we use a lift-based scoring system that allows a fair comparison of different models. The PNN clearly outperformed the other models. The results demonstrate the successful application of the PNN model to multiclass cancer classification.
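The abstract does not give the model's equations, so the following is only a minimal sketch of a textbook PNN: a Gaussian Parzen-window density estimate per class, normalized into class posteriors that can serve as confidence values, followed by a minimum-expected-cost decision that accepts an asymmetric cost matrix. The function names `pnn_posteriors` and `min_risk_decision`, the smoothing parameter `sigma`, the toy expression values, and the cost matrix are all illustrative assumptions and are not taken from the paper.

```python
import math


def pnn_posteriors(train, x, sigma=1.0, priors=None):
    """Return {label: posterior probability} for one sample x.

    train maps each class label to a list of feature vectors.
    sigma is the kernel width (an assumed, hand-picked value here).
    """
    labels = list(train)
    if priors is None:
        # Assumption: estimate priors from class frequencies.
        total = sum(len(samples) for samples in train.values())
        priors = {c: len(train[c]) / total for c in labels}
    scores = {}
    for c in labels:
        # Pattern layer: one Gaussian kernel per training sample.
        kernels = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, t)) / (2.0 * sigma ** 2))
            for t in train[c]
        ]
        # Summation layer: the mean kernel response is the class density
        # estimate, up to a constant shared by all classes.
        scores[c] = priors[c] * sum(kernels) / len(kernels)
    z = sum(scores.values())
    if z == 0.0:
        # All kernels underflowed; fall back to the priors.
        return dict(priors)
    return {c: s / z for c, s in scores.items()}


def min_risk_decision(posteriors, cost):
    """Pick the label with the lowest expected misclassification cost.

    cost[true][pred] is the cost of predicting pred when the truth is true.
    """
    risks = {
        pred: sum(posteriors[true] * cost[true][pred] for true in posteriors)
        for pred in posteriors
    }
    return min(risks, key=risks.get), risks


# Toy data (made up): two expression features, three classes.
train = {
    "class_1": [[2.1, 0.3], [1.9, 0.5]],
    "class_2": [[0.2, 1.8], [0.4, 2.2]],
    "class_3": [[1.0, 1.0], [1.2, 0.9]],
}
# Hypothetical costs: missing class_2 is five times as costly as other errors.
cost = {
    "class_1": {"class_1": 0, "class_2": 1, "class_3": 1},
    "class_2": {"class_1": 5, "class_2": 0, "class_3": 5},
    "class_3": {"class_1": 1, "class_2": 1, "class_3": 0},
}

post = pnn_posteriors(train, [0.8, 1.3], sigma=0.5)
label, risks = min_risk_decision(post, cost)
print(post)
print(label, risks)
```

With symmetric costs the decision reduces to picking the class with the highest posterior; raising the cost of one error type shifts borderline samples toward the class that is expensive to miss.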
Similar resources
A Probabilistic Neural Network for Gene Selection and Classification of Microarray Data
In this paper, we present the mathematical foundations of a probabilistic neural network for gene selection and classification of high-dimensional microarray data. We present a catalogue of features that a classification system for microarray data should incorporate. We then use this catalogue and compare the theoretical properties of probabilistic neural networks with support vector machines w...
A Performance Analysis of Cancer Classification Using Feature Extraction and Probabilistic Neural Networks
Accurate diagnosis and classification is the key issue for the optimal treatment of cancer patients. Several studies demonstrate that cancer classification can be estimated with high accuracy, sensitivity and specificity from microarray-based gene expression profiling using artificial neural networks. In this paper, a comprehensive study was undertaken to investigate the capability of the proba...
Study of HMGA2 Gene Inhibition with Specific shRNA and siRNA and Investigation of Corresponding Effects on Downstream Gene Expression in MDA-MB-231 Cancer Cells: A Bioinformatic and Experimental Study
Background & Aims: The use of siRNA to silence gene expression is expanding rapidly. The aim of this study is to investigate, bioinformatically and experimentally, the inhibition of the HMGA2 gene and its effects on the expression of downstream genes in MDA-MB-231 cancer cells treated with shRNA and siRNA specific to HMGA2. Materials & Methods: To perform this bioinformatic a...
PCA disjoint models for multiclass cancer analysis using gene expression data
Motivation: Microarray expression profiling appears particularly promising for a deeper understanding of cancer biology and to identify molecular signatures supporting the histological classification schemes of neoplastic specimens. However, molecular diagnostics based on microarray data presents major challenges due to the overwhelming number of variables and the complex, multiclass nature of t...
Tumor Disease Multiclass Prediction using Biomolecular Gene Expression Data by Signal Processing and Computational Intelligence Techniques
Article history: received 10 Dec. 2014; accepted 15 Jan. 2015; available online 20 Jan. 2015. Tumor disease multiclass prediction from nucleotide expression is an emerging research area in the field of bioinformatics. Gene expression profiling has emerged as an efficient technique for cancer classification as well as for diagnosis, prognosis, and treatment purposes. Studying cancer microarray...
Journal title:
- Pacific Symposium on Biocomputing
Publication date: 2003